High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library
نویسندگان
چکیده
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA. This makes development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard by three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons which specifically target stencil computations – MapOverlap and Stencil – and we describe their use for particular application examples, discuss their efficient parallel implementation, and report experimental results on systems with multiple GPUs. Our evaluation of three real-world applications shows that stencil code written with SkelCL is considerably shorter and offers competitive performance to hand-tuned OpenCL code. Electronic version of an article published as Parallel Processing Letters, Volume 24, Issue 03, September 2014, 17 pages. DOI: 10.1142/S0129626414410059 c ©World Scientific Publishing Company, Journal URL: http://www.worldscientific.com/worldscinet/ppl
منابع مشابه
Extending the SkelCL Skeleton Library for Stencil Computations on Multi-GPU Systems
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA, which makes it a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high level of programming abstractio...
متن کاملHigh-Level Programming for Medical Imaging on Multi-GPU Systems Using the SkelCL Library
Application development for modern high-performance systems with Graphics Processing Units (GPUs) relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we present SkelCL – a high-level programming model for systems with multiple GPUs and its implementation as a library on top of OpenCL. SkelCL provides three mai...
متن کاملSkelCL: Enhancing OpenCL for High-Level Programming of Multi-GPU Systems
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-39958-9_24. Abstract. Application development for modern high-performance systems with Graphics Processing Units (GPUs) currently relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs. In this paper, we present SkelCL – a high-level programmi...
متن کاملUsing the SkelCL Library for High-Level GPU Programming of 2D Applications
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-36949-0_41. Abstract. Application programming for GPUs (Graphics Processing Units) is complex and error-prone, because the popular approaches — CUDA and OpenCL — are intrinsically low-level and offer no special support for systems consisting of multiple GPUs. The SkelCL library offers pre-implemented recurrin...
متن کاملGPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs
Spatial blocking is a critical memory-access optimization to efficiently exploit the computing resources of parallel processors, such as many-core GPUs. By reusing cache-loaded data over multiple spatial iterations, spatial blocking can significantly lessen the pressure of accessing slow global memory. Stencil computations, for example, can exploit such data reuse via spatial blocking through t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Parallel Processing Letters
دوره 24 شماره
صفحات -
تاریخ انتشار 2014